Interactive Discovery of Interesting Subgroup Sets
Abstract
Although subgroup discovery aims to be a practical tool for exploratory data mining, its wider adoption is hampered by redundancy and the re-discovery of common knowledge. This can be remedied by parameter tuning and manual result filtering, but this requires considerable effort from the data analyst. In this paper we argue that it is essential to involve the user in the discovery process to solve these issues. To this end, we propose an interactive algorithm that allows a user to provide feedback during search, so that it is steered towards more interesting subgroups. Specifically, the algorithm exploits user feedback to guide a diverse beam search. The empirical evaluation and a case study demonstrate that uninteresting subgroups can be effectively eliminated from the results, and that the overall effort required to obtain interesting and diverse subgroup sets is reduced. This confirms that within-search interactivity can be useful for data analysis.
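The abstract describes the approach only at a high level, so the following is a minimal sketch of how user feedback could steer a diverse beam search, not the authors' algorithm. Everything in it is an assumption made for illustration: the toy data, the use of weighted relative accuracy (WRAcc) as the quality measure, the cover-overlap (Jaccard) rule for diversity, and the hypothetical `feedback_weight` model in which a user "dislikes" whole attributes.

```python
# Toy data (illustrative only): each row is (attribute values, binary target).
ROWS = [
    ({"age": "young", "smoker": "yes", "sport": "no"}, 1),
    ({"age": "young", "smoker": "yes", "sport": "no"}, 1),
    ({"age": "old", "smoker": "yes", "sport": "no"}, 1),
    ({"age": "old", "smoker": "yes", "sport": "yes"}, 1),
    ({"age": "young", "smoker": "no", "sport": "no"}, 0),
    ({"age": "old", "smoker": "no", "sport": "yes"}, 0),
    ({"age": "young", "smoker": "no", "sport": "yes"}, 0),
    ({"age": "old", "smoker": "no", "sport": "no"}, 1),
]


def cover(desc, rows):
    """Indices of rows that satisfy every (attribute, value) condition in desc."""
    return frozenset(
        i for i, (x, _) in enumerate(rows) if all(x.get(a) == v for a, v in desc)
    )


def wracc(cov, rows):
    """Weighted relative accuracy: coverage * (subgroup target rate - overall rate)."""
    if not cov:
        return 0.0
    n = len(rows)
    overall = sum(y for _, y in rows) / n
    inside = sum(rows[i][1] for i in cov) / len(cov)
    return len(cov) / n * (inside - overall)


def feedback_weight(desc, disliked, penalty=0.1):
    """Hypothetical feedback model: down-weight descriptions using a disliked attribute."""
    return penalty if any(a in disliked for a, _ in desc) else 1.0


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_diverse(cands, width, max_overlap=0.8):
    """Greedily keep the best candidates whose covers do not nearly duplicate a kept one."""
    beam = []
    for score, desc, cov in sorted(cands, key=lambda c: -c[0]):
        if all(jaccard(cov, kept) <= max_overlap for _, _, kept in beam):
            beam.append((score, desc, cov))
        if len(beam) == width:
            break
    return beam


def search(rows, depth=2, width=3, disliked=frozenset()):
    """Level-wise beam search; returns a list of (score, description, cover) tuples."""
    conditions = sorted({(a, v) for x, _ in rows for a, v in x.items()})
    beam = [(0.0, frozenset(), frozenset(range(len(rows))))]
    results = []
    for _ in range(depth):
        cands = {}
        for _, desc, _ in beam:
            used = {a for a, _ in desc}
            for cond in conditions:
                new = desc | {cond}
                if cond[0] in used or new in cands:
                    continue
                cov = cover(new, rows)
                quality = wracc(cov, rows)
                if quality > 0:  # keep only subgroups better than the overall rate
                    weighted = quality * feedback_weight(new, disliked)
                    cands[new] = (weighted, new, cov)
        beam = select_diverse(cands.values(), width)
        results.extend(beam)
    return select_diverse(results, width)


if __name__ == "__main__":
    for label, disliked in [("no feedback", frozenset()), ("dislike smoker", frozenset({"smoker"}))]:
        score, desc, _ = search(ROWS, disliked=disliked)[0]
        print(label, sorted(desc), round(score, 4))
```

In this sketch, feedback is passed in once via `disliked`; an interactive version would pause between search levels, show the current beam to the user, and update `disliked` before refining further. The multiplicative penalty is one simple design choice among many, and it only reorders subgroups rather than removing them.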
Similar resources
Knowledge-intensive subgroup mining: techniques for automatic and interactive discovery
Data mining has proved its significance in various domains and applications. As an important subfield of the general data mining task, subgroup mining can be used, e.g., for marketing purposes in business domains, or for quality profiling and analysis in medical domains. The goal is to efficiently discover novel, potentially useful and ultimately interesting knowledge. However, in real-world si...
Contrast Mining from Interesting Subgroups
Subgroup discovery methods find interesting subsets of objects of a given class. We propose to extend subgroup discovery by a second subgroup discovery step to find interesting subgroups of objects specific for a class in one or more contrast classes. First, a subgroup discovery method is applied. Then, contrast classes of objects are defined by using set theoretic functions on the discovered s...
Interactive Knowledge Frontier Discovery with COBWEB-KFD
Knowledge frontier discovery is a novel technique for identifying interesting subpopulations of a dataset with respect to classification performance. A knowledge frontier is a collection of meaningful groups where any sub-partition with significantly different predictive accuracy is not meaningful. This research introduces knowledge frontiers and knowledge frontier discovery. The first knowledg...
Non-redundant Subgroup Discovery in Large and Complex Data
Large and complex data is challenging for most existing discovery algorithms, for several reasons. First of all, such data leads to enormous hypothesis spaces, making exhaustive search infeasible. Second, many variants of essentially the same pattern exist, due to (numeric) attributes of high cardinality, correlated attributes, and so on. This causes top-k mining algorithms to return highly red...
Subgroup Analytics and Interactive Assessment on Ubiquitous Data
This paper applies subgroup discovery for obtaining interesting descriptive patterns in ubiquitous data. Furthermore, we provide a novel graph-based analysis approach for assessing the relations between the obtained subgroup set, and for comparing subgroups according to their relations to other subgroups. We present and discuss first results utilizing real-world data, given by noise measurement...